Learning User Preferences to Incentivize Exploration in the Sharing Economy

Authors

Christoph Hirnschall

Adish Singla

Sebastian Tschiatschek

Andreas Krause

Abstract

We study platforms in the sharing economy and discuss the need for incentivizing users to explore options that otherwise would not be chosen. For instance, rental platforms such as Airbnb typically rely on customer reviews to provide users with relevant information about different options. Yet, often a large fraction of options does not have any reviews available. Such options are frequently neglected as viable choices, and in turn are unlikely to be evaluated, creating a vicious cycle. Platforms can engage users to deviate from their preferred choice by offering monetary incentives for choosing a different option instead. To efficiently learn the optimal incentives to offer, we consider structural information in user preferences and introduce a novel algorithm Coordinated Online Learning (CoOL) for learning with structural information modeled as convex constraints. We provide formal guarantees on the performance of our algorithm and test the viability of our approach in a user study with data of apartments on Airbnb. Our findings suggest that our approach is well-suited to learn appropriate incentives and increase exploration on the investigated platform.

Introduction

In recent years, numerous sharing economy platforms with a variety of goods and services have emerged. These platforms are shaped by users that primarily act in their own interest to maximize their utility. However, such behavior might interfere with the usefulness of the platforms. For example, users of mobility sharing systems typically prefer to drop off rentals at the location in closest proximity, while a more balanced distribution would allow the mobility sharing service to operate more efficiently.

Undesirable user behavior in the sharing economy is in many cases even self-reinforcing. For example, users in the apartment rental marketplace Airbnb are less likely to select infrequently reviewed apartments and are therefore unlikely to provide reviews for these apartments (Fradkin 2014).
This is also reflected in the distribution of reviews, where in many cities 20% of apartments account for more than 80% of customer reviews (data from insideairbnb.com). Such dynamics create a need for platforms in the sharing economy to actively engage users to shape demand and improve efficiency.

Several previous papers have proposed the idea of using monetary incentives to encourage desirable behavior in such systems. One example is Frazier et al. (2014), who studied the problem in a multi-armed bandit setting, where a principal (e.g., a marketplace) attempts to maximize utility by incentivizing agents to explore arms other than the myopically preferred one. In their setting, the optimal amount is known to the system, and the main goal is to quantify the required payments to achieve an optimal policy with myopic agents.

The idea of shaping demand through monetary incentives in the sharing economy has also been tested in practice. For example, Singla et al. (2015) use monetary incentives to encourage users of bike sharing systems to return bikes at beneficial locations, making automatic offers through the bike sharing app.

In this context, an important question is what amounts a platform should offer to maximize its utility. Singla et al. (2015) introduce a simple protocol for learning optimal incentives in the bike sharing system to make users switch from the preferred station to a more beneficial one, ignoring information about specific switches and additional context. Extending these ideas, we explore a general online learning protocol for efficiently learning optimal incentives.

*Work performed while at ETH Zurich. Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Our Contributions

We provide the following main contributions in this paper:

• Structural information: We consider structural information in user preferences to speed up learning of incentives, and provide a general framework to model structure across tasks via convex constraints. Our algorithm, Coordinated Online Learning (CoOL), is also of interest for related multi-task learning problems.
• Computational efficiency: We introduce two novel ideas of sporadic and approximate projections to increase the computational efficiency of our algorithm. We derive formal guarantees on the performance of the CoOL algorithm and achieve no-regret bounds in this setting.
• User study on Airbnb: We collect a unique data set through a user study with apartments on Airbnb and test the viability and benefit of the CoOL algorithm on this data set.

Preliminaries

In the following, we introduce the general problem setting of this paper.

Platform. We investigate a general platform in the sharing economy, such as the apartment rental marketplace Airbnb. On this platform, users can choose from n goods and services, denoted as items. A user that arrives at time t chooses an item i ∈ [n]. If the user chooses to buy item i, the platform gains utility u_i.

Incentivizing exploration. The initial choice, item i, might not maximize the platform’s utility, and the platform might be interested in offering a different item j with utility u_j > u_i instead. For example, j could represent an infrequently reviewed item that the platform wants to explore. To motivate the user to select item j instead, the platform can offer an incentive p, for example in the form of a monetary discount on that item. The user can either accept or reject the offer p depending on the private cost c, where the user accepts the offer if p ≥ c and rejects the offer otherwise. If the user accepts the offer, the utility gain of the platform is u_j − u_i − p.

Objective.
In this setting, two tasks need to be optimized to achieve a high utility gain: finding good switches i → j, and finding good incentives p. Good switches i → j are those in which the achievable utility gain is positive, i.e., u_j − u_i − c > 0. To realize a positive utility gain, the offer p needs to be greater than or equal to c, since otherwise the offer would be rejected. In this paper, we focus on learning optimal incentives p over time, while the platform chooses relevant switches i → j independently.

Methodology

In this section, we present our methodology for learning optimal incentives p_{i,j} and start with a single pair of items (i, j). We allow for natural constraints on p_{i,j}, such that p_{i,j} ∈ S_{i,j}, where S_{i,j} is convex and non-empty. For example, S_{i,j} might be lower-bounded by 0 and upper-bounded by the maximum discount that the platform is willing to offer.

Single Pair of Items

We consider the popular algorithmic framework of online convex programming (OCP) (Zinkevich 2003) to learn optimal incentives p_{i,j} for a single pair of items. The OCP algorithm is a gradient-descent style algorithm that updates with an adaptive learning rate and performs a projection after every gradient step to maintain feasibility within the constraints S_{i,j}. We use τ^t_{i,j} to denote the number of times a pair of items i, j has been observed and η to denote the learning rate. To measure the performance of the algorithm, we use the loss l(p_{i,j}), which is the difference between the optimal prediction and the prediction provided by the algorithm, such that

l(p) = 1{p^t ≥ c^t} · (p − c) + 1{p^t […] 0

Initialize: p^1_{i,j} ∈ S, τ^0_{i,j} = 0
for t = 1, 2, . . . , T do
    Suffer loss l(p^t_{i,j})
    Calculate gradient g^t_{i,j}
    Set τ^t_{i,j} = τ^{t−1}_{i,j} + 1
    Update p^{t+1}_{i,j} = p^t_{i,j} − (η / √τ^t_{i,j}) · g^t_{i,j}
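
The single-pair procedure described above can be sketched in Python as projected online gradient descent on an interval. This is a minimal illustration under stated assumptions, not the authors' implementation: the constraint set is taken to be an interval [lower, upper], the function names (`project`, `utility_gain`, `ocp_single_pair`) are hypothetical, a rejected offer is assumed to yield zero gain, and because the rejection branch of the loss is not available in the text, the sketch assumes a subgradient of +1 after an accepted offer (the platform overpaid) and −1 after a rejected one.

```python
import math


def project(p, lower, upper):
    """Euclidean projection of a scalar onto the interval [lower, upper]."""
    return min(max(p, lower), upper)


def utility_gain(u_i, u_j, p, c):
    """Platform's gain from offering incentive p for the switch i -> j.

    The user accepts iff p >= c. Assumption: a rejected offer yields zero gain.
    """
    return (u_j - u_i - p) if p >= c else 0.0


def ocp_single_pair(costs, eta=1.0, lower=0.0, upper=10.0, p0=0.0):
    """Learn an incentive for one pair (i, j) from a sequence of private costs.

    Returns the list of offers made and the final incentive.
    """
    p = project(p0, lower, upper)
    tau = 0  # number of times the pair has been observed
    offers = []
    for c in costs:
        offers.append(p)
        # Assumed subgradient: +1 if the offer was accepted, -1 if rejected.
        g = 1.0 if p >= c else -1.0
        tau += 1
        # Gradient step with adaptive rate eta / sqrt(tau), then projection.
        p = project(p - eta / math.sqrt(tau) * g, lower, upper)
    return offers, p


if __name__ == "__main__":
    offers, p_final = ocp_single_pair([3.0] * 200)
    print(round(p_final, 3))
```

With a constant private cost, the offers in this sketch rise until the first acceptance and then oscillate around the cost with an amplitude that shrinks like η/√τ, which is the behavior the adaptive learning rate is meant to produce.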

Similar resources

A social recommender system based on matrix factorization considering dynamics of user preferences

With the expansion of social networks, the use of recommender systems in these networks has attracted considerable attention. Recommender systems have become an important tool for alleviating users' information overload by providing personalized recommendations of items a user might like, based on past preferences or observed behavior. In these systems...

Assessment of user preferences of campus green space at Ferdowsi University of Mashhad-Iran

Researchers have found that a user’s perception of the campus environment is related to quality life and academic accomplishment. In this study, we have analyzed the perceptions of more than 600 users at the Ferdowsi University of Mashhad to evaluate the level of green space use and to understand user preferences from aesthetics and safety aspects. The results show that for most of the responde...

A Grouping Hotel Recommender System Based on Deep Learning and Sentiment Analysis

Recommender systems are important tools for users to identify their preferred items and for businesses to improve their products and services. In recent years, the use of online services for the selection and reservation of hotels has witnessed booming growth. Customer reviews have replaced word-of-mouth marketing, but searching hotels based on user priorities is more time-consuming. This s...

Investigating the Adoption Rate of Students' Mental Model with the Structure of the Learning Management System of the University of Tehran by Card Sorting Method

Background and Aim: E-learning is an important topic in educational settings, and students play an essential role in the acceptance and effective use of e-learning management systems, so knowing their attitudes and mental models is essential for the successful implementation of such a method. Therefore, the aim of this study was to investigate...

Intuitive Network Applications: Learning for Personalized Converged Services Involving Social Networks

The convergence of the wireline telecom, wireless telecom, and internet networks and the services they provide offers tremendous opportunities in services personalization. We distinguish between two broad categories of personalization systems: recommendation systems, such as used in advertising, and life-style assisting systems, which attempt to customize or specialize services to an individual...

Journal: CoRR

Volume: abs/1711.08331

Publication date: 2017
